Skip to content

Proof Systems: Chaining, Treelike Proofs, and Resolution

Database/feature-matrix view of a knowledge base (DF)

If the knowledge base is a set of formulas (hypotheses) $KB=\lbrace h_1,\dots,h_k\rbrace $, we can also view it as the set of worlds consistent with all of them:

[ DF_{KB} := \lbrace \bar x\in X : \forall i\in[k],\ h_i(\bar x)=1\rbrace ] Equivalently, \(DF_{KB}\) is the “database” of all assignments that satisfy the conjunction \(\bigwedge_{i=1}^k h_i\).

Tiny example (two Boolean attributes)

Let the attributes be dog and furry animal. The rule

\[ \text{dog}\Rightarrow\text{furry animal} \]

eliminates exactly the world where dog=1 and furry animal=0. So in a 2×2 grid (columns: dog=0/1, rows: furry=0/1), the only forbidden cell is (dog=1, furry=0); the other 3 cells are in \(DF\).

A set of possible world \(X\) corresponding to the setting of \(n\) Boolean attributes, e.g., “dog”, “furry animal”. Actual possibilities \(D\subseteq X\), the set $\lbrace \bar x\in X, \text{"dog"}\implies\text{"furry animal"}\rbrace $ contains all of \(D\), we say it is valid for \(D\), \(D\models\text{"dog"}\implies\text{"furry animal"}\)

Declarative Knowledge

Formulas (or other representations) that are valid for \(D\).

Knowledge Base

We’ll be given a set of representations $\lbrace h_1,\dots,h_k\rbrace $ called a knowledge base.

Implicitly, the knowledge base (KB) corresponds to that formula \(\bigwedge_{i=1}^{k}h_i\).

Implicitly, if the \(h_1,\dots,h_k\) are my knowledge base, we have \(D\models\bigwedge_{i=1}^{k}h_i\)

Reasoning Task

Given classes of representations, \(Q\) (queries) and \(H\) (hypotheses). The reasoning task is given a \(KB\subseteq H\), query \(q\in Q\), and it should output “accept” or “reject”.

If an algorithm only outputs “accept” when \(KB\models q\), we say it is sound.

If an algorithm outputs “accept” whenever \(KB\models q\), we say it is complete.

Observe

KB does not entail \(q\) if and only if there is \(\bar x^*\in X\) such that

  1. \(q(\bar x^*)=0\)
  2. \(\forall h_i\in KB, h_i(\bar x^*)=1\)

So, deciding whether \(KB\models q\) is equivalent to whether \(\lnot q(x)\land \left(\bigwedge^k_{i=1}h_i(x)\right)\) is satisfiable

So, if \(H\) contains all clauses of 3 literals (ORs of less than or equal to 3 variables or negations of variables) and \(Q\) has anything non-trivial (e.g., a tautology literal). 3 CNF satisfiability (3 SAT) is a special case of reasoning for \(Q+H\) where 3 SAT is NP-complete (hence, intractable).

Thus, the complete requirement of the reasoning task can be weakened to provability in a proof system

Proof System

A proof system is given by a class of representations \(J\) and a relations \(R\in J\times J^*\) (\(J^*\) is the finite length of \(\phi\in J\)). A proof of \(\phi\in J\) from premises \(h_1,\dots,h_k\in J\) is a sequence \(\pi=\pi_1,\dots,\pi_l\) where \(\pi_i\in J\) and for each \(i\in[1,l]\), there is a subsequence \(\pi'_1,\dots,\pi'_r\) from \(\pi_1,\dots,\pi_{i-1}\) such that \((\pi'_i,\dots,\pi'_r)\in R\) or $\pi_i \in\lbrace h_1,\dots,h_k\rbrace $ with \(\pi_l=\phi\).

If \((\pi_i)\in R\) (i.e., with \(r=0\) prior lines) we say it is an axiom.

Otherwise, it follows from rule of inference.

We say that the proof system is sound if whenever \((\pi_1, \pi'_1,\dots,\pi'_r)\) then \(\lbrace \pi'_1,\dots,\pi'_r\rbrace \models\pi_1\).

In particular, axioms must be tautologies (true for all \(\bar x\in X\))

If the proof system is sound, and there is a proof of \(\phi\) from $KB=\lbrace h_1,\dots,h_k\rbrace $, then \(KB\models \phi\).

Chaining (Example of Proof System)

\(J\) contains formulas of the form, \([\bigwedge^r_{i=1}l_i]\implies l^*\) for literals \(l_1,\dots,l_r,l^*\) “rules” (when \(r\geq 1\)) and \(l^*\) “facts” (\(r=0\)).

\(R\) contains \(\left(l^*, [\bigwedge^r_{i=1}l_i]\implies l^*, l_1, \dots, l_r\right)\). i.e., can conclude \(l^*\) from rules + body literals.

Example: Chaining proof of mischief

Knowledge base (rules + facts):

  • \([\text{dog}\land \text{young}] \Rightarrow \text{puppy}\)
  • \(\text{puppy} \Rightarrow \text{mischief}\)
  • \(\text{dog}\)
  • \(\text{young}\)

A chaining proof of mischief is:

  1. From \(\text{dog}\) and \(\text{young}\) and \([\text{dog}\land \text{young}]\Rightarrow \text{puppy}\), infer \(\text{puppy}\).
  2. From \(\text{puppy}\) and \(\text{puppy}\Rightarrow \text{mischief}\), infer \(\text{mischief}\).

So there is a proof of mischief from $\lbrace [\text{dog}\land \text{young}]\Rightarrow \text{puppy},\ \text{puppy}\Rightarrow \text{mischief},\ \text{dog},\ \text{young}\rbrace $.

Limitation: chaining cannot derive “new negations”

Consider the KB:

  • \([\text{dog}\land \text{young}] \Rightarrow \text{puppy}\)
  • \(\text{dog}\)
  • \(\lnot\text{puppy}\)

Semantically, this KB entails \(\lnot\text{young}\) (because if dog is true and puppy is false, then the antecedent dog ∧ young must be false, hence young must be false).

However, chaining cannot prove \(\lnot\text{young}\) here: chaining only concludes - given facts, or - literals that appear in the heads (consequents) of rules.

Treelike proofs

It’s often helpful to draw a proof as a dependency graph: edges point from a derived line to the earlier lines used to justify it.

If the proof graph is a tree (equivalently: every earlier line is used at most once to justify a later step), we say the proof is treelike.

Resolution (Example of Proof System)

Let \(J\) be the set of clauses, i.e., disjunctions (ORs) of literals.

The resolution proof system has the inference rule (sometimes called cut):

\[ \frac{(x_i \lor C)\quad (\lnot x_i \lor D)}{C \lor D} \]

That is: given a pair of clauses containing a complementary pair \(x_i\) and \(\lnot x_i\), we may conclude the OR of the “rest of the literals”, dropping \(x_i\) and \(\lnot x_i\).

Why the rule is sound

Claim: [ \lbrace x_i \lor C,\ \lnot x_i \lor D\rbrace \models C \lor D. ]

Proof (by cases on \(x_i\)):

  • If \(x_i=0\), then \(x_i\lor C\) being true implies \(C\) is true, hence \(C\lor D\) is true.
  • If \(x_i=1\), then \(\lnot x_i\lor D\) being true implies \(D\) is true, hence \(C\lor D\) is true.

So the inference preserves logical consequence.

Using resolution for proof by contradiction of a DNF query

Typically, we use resolution to refute the negation of a query.

Let the query be a DNF: [ q = T_1 \lor T_2 \lor \cdots \lor T_k, ] where each \(T_i\) is a conjunction (AND) of literals.

Then [ \lnot q = \lnot(T_1\lor\cdots\lor T_k) = (\lnot T_1)\land(\lnot T_2)\land\cdots\land(\lnot T_k). ]

Each \(\lnot T_i\) is equivalent to a clause \(C_i\) (an OR of literals), so we take the set of clauses [ \lbrace C_1,\dots,C_k\rbrace ] as the hypotheses and attempt to derive the empty clause \(\bot\).

  • The empty clause \(\bot\) is false in every possible world.
  • Deriving \(\bot\) means the hypotheses are unsatisfiable, i.e., \(\lnot q\) is unsatisfiable, hence \(q\) is valid (entailed) under the hypotheses.

Soundness intuition for refutations

If a resolution derivation reaches \(\bot\), then somewhere (explicitly or implicitly) we have derived both \(x_i\) and \(\lnot x_i\) for some variable.

If the starting clauses were all satisfiable by some assignment \(\bar x^*\), sound inference would preserve satisfiability along the derivation; but \(\bot\) is unsatisfiable. Therefore, reaching \(\bot\) implies there is no satisfying assignment to the starting clauses.